A Logistic Based Mathematical Model to Optimize Duplicate Elimination Ratio in Content Defined Chunking Based Big Data Storage System

Authors

  • Longxiang Wang
  • Xiaoshe Dong
  • Xingjun Zhang
  • Fuliang Guo
  • Yinfeng Wang
  • Weifeng Gong
Abstract

Longxiang Wang 1, Xiaoshe Dong 1, Xingjun Zhang 1,*, Fuliang Guo 1, Yinfeng Wang 2 and Weifeng Gong 3

1 The School of Electronic and Information Engineering, Xi’an Jiaotong University, Xi’an 710049, China; [email protected] (L.W.); [email protected] (X.D.); [email protected] (F.G.)
2 The Shenzhen Institute of Information Technology, Shenzhen 518172, China; [email protected]
3 State Key Laboratory of High-End Server & Storage Technology, Jinan 250101, China; [email protected]
* Correspondence: [email protected]; Tel.: +86-029-8266-8478

Related articles

A Dynamic Deduplication Approach for Big Data Storage

As data grows every day, managing storage devices for this explosive growth of digital data is a very challenging task. Data reduction has become a crucial problem. The deduplication approach plays a vital role in removing redundancy in large-scale cluster computing storage. As a result, deduplication provides better storage utilization by eliminating redundant copies of data and savi...

Bimodal Content Defined Chunking for Backup Streams

Data deduplication has become a popular technology for reducing the amount of storage space necessary for backup and archival data. Content defined chunking (CDC) techniques are well established methods of separating a data stream into variable-size chunks such that duplicate content has a good chance of being discovered irrespective of its position in the data stream. Requirements for CDC incl...

Survey of Research on Chunking Techniques

The explosive growth of data produced by different devices and applications has contributed to the abundance of big data. To process such amounts of data efficiently, strategies such as de-duplication have been employed. Among the three levels of de-duplication, namely file level, block level, and chunk level, de-duplication at the chunk level, also known as byte level, is the most popular a...

Two Stage Max Gain Content Defined Chunking for De-duplication

Data de-duplication is a very simple concept with very smart technology associated with it. Data blocks are stored only once: de-duplication systems decrease storage consumption by identifying distinct chunks of data with identical content. They then store a single copy of the chunk along with metadata about how to reconstruct the original files from the chunks; this takes up less stora...

Offline Selective Data Deduplication for Primary Storage Systems

Data deduplication is a technology that eliminates redundant data to save storage space. Most previous studies on data deduplication target backup storage, where the deduplication ratio and throughput are important. However, data deduplication on primary storage has recently been receiving attention; in this case, I/O latency should be considered equally with the deduplication ratio. Unfortunat...

Journal title:
  • Symmetry

Volume 8  Issue 

Pages  -

Publication date 2016